Efficient Clustering for High Dimensional Data: Subspace Based Clustering and Density Based Clustering
Similar resources
Clustering for High Dimensional Data: Density based Subspace Clustering Algorithms
Finding clusters in high dimensional data is a challenging task because such data comprises hundreds of attributes. Subspace clustering is an evolving methodology which, instead of finding clusters in the entire feature space, aims at finding clusters in various overlapping or non-overlapping subspaces of the high dimensional dataset. Density based subspace clustering algorithms t...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Detecting clusters in moderate-to-high dimensional data: subspace clustering, pattern-based clustering, and correlation clustering
As a prolific research area in data mining, subspace clustering and related problems induced a vast amount of proposed solutions. However, many publications compare a new proposition – if at all – with one or two competitors or even with a so-called “naïve” ad hoc solution but fail to clarify the exact problem definition. As a consequence, even if two solutions are thoroughly compared experimen...
Density-Connected Subspace Clustering for High-Dimensional Data
Several application domains such as molecular biology and geography produce a tremendous amount of data which can no longer be managed without the help of efficient and effective data mining methods. One of the primary data mining tasks is clustering. However, traditional clustering algorithms often fail to detect meaningful clusters because most real-world data sets are characterized by a high...
Journal
Journal title: Information Technology Journal
Year: 2011
ISSN: 1812-5638
DOI: 10.3923/itj.2011.1092.1105